BOAI: Fast Alternating Decision Tree Induction Based on Bottom-Up Evaluation
نویسندگان
چکیده
Alternating Decision Tree (ADTree) is a successful classification model based on boosting and has a wide range of applications. The existing ADTree induction algorithms apply a “top-down” strategy to evaluate the best split at each boosting iteration, which is very time-consuming and thus is unsuitable for modeling on large data sets. This paper proposes a fast ADTree induction algorithm (BOAI) based on “bottom-up” evaluation, which offers high performance on massive data without sacrificing classification accuracy. BOAI uses a pre-sorting technique and dynamically evaluates splits by a bottom-up approach based on VW-group. With these techniques, huge redundancy in sorting and computation can be eliminated in the tree induction procedure. Experimental results on both real and synthetic data sets show that BOAI outperforms the best existing ADTree induction algorithm by a significant margin. In the real case study, BOAI also provides better performance than TreeNet and Random Forests, which are considered as efficient classification models.
منابع مشابه
A framework for bottom-up induction of oblique decision trees
Decision-tree induction algorithms are widely used in knowledge discovery and data mining, specially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...
متن کاملBottom-up automata on data trees and vertical XPath
A data tree is a tree whose every node carries a label from a finite alphabet and a datum from some infinite domain. We introduce a new model of automata over unranked data trees with a decidable emptiness problem. It is essentially a bottom-up alternating automaton with one register, enriched with epsilon-transitions that perform tests on the data values of the subtree. We show that it capture...
متن کاملGlobal Induction of Decision Trees: From Parallel Implementation to Distributed Evolution
In most of data mining systems decision trees are induced in a top-down manner. This greedy method is fast but can fail for certain classification problems. As an alternative a global approach based on evolutionary algorithms (EAs) can be applied. We developed Global Decision Tree (GDT) system, which learns a tree structure and tests in one run of the EA. Specialized genetic operators are used,...
متن کاملAn Improved Lower Bound for the Randomized Decision Tree Complexity of Recursive Majority,
We prove that the randomized decision tree complexity of the recursive majority-of-three is Ω(2.55), where d is the depth of the recursion. The proof is by a bottom up induction, which is same in spirit as the one in the proof of Saks and Wigderson in their 1986 paper on the complexity of evaluating game trees. Previous work includes an Ω (
متن کاملMulticlass Alternating Decision Trees
The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several wrapper methods for extending the algorithm...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008